Mass transportation with LQ cost functions 
Abstract

We study the optimal transport problem in the Euclidean space where the cost
function is given by the value function associated with a Linear Quadratic
minimization problem. Under appropriate assumptions, we generalize Brenier's Theorem,
proving existence and uniqueness of an optimal transport map. In the controllable case,
we show that the optimal transport map has to be the gradient of a convex function
up to a linear change of coordinates. We give regularity results and also investigate
the non-controllable case.

1 Introduction 

The optimal transport problem can be stated as follows: given two probability measures
$\mu_0$ and $\mu_1$, defined on measurable spaces $X$ and $Y$ respectively, and a cost function

$$c : X \times Y \to \mathbb{R} \cup \{+\infty\},$$

find a measurable map

$$T : X \to Y$$

which pushes forward $\mu_0$ to $\mu_1$, that is

$$T_\sharp \mu_0 = \mu_1 \quad \text{(i.e. } \mu_1(B) = \mu_0\bigl(T^{-1}(B)\bigr) \text{ for all measurable } B \subset Y\text{)},$$

and which minimizes the transportation cost

$$\mathrm{cost}_c(T) := \int_X c(x, T(x)) \, d\mu_0(x).$$

When the transport condition $T_\sharp \mu_0 = \mu_1$ is satisfied, we say that $T$ is a transport map,
and if $T$ also minimizes the cost, that is if

$$\mathrm{cost}_c(T) = \min_{S_\sharp \mu_0 = \mu_1} \bigl\{ \mathrm{cost}_c(S) \bigr\},$$

then we call it an optimal transport map. The problem goes back to the seminal paper of
Gaspard Monge in 1781 [12], and there was a revival of interest in mass transportation in
the nineties. In 1987 [31 H], Brenier proved an existence and uniqueness result for optimal
transport maps for the cost $c(x,y) = |x - y|^2$ in $\mathbb{R}^n$ and showed that any such optimal
transport map is the gradient of a convex function. Since then, the theory has been extended
to other cost functions in $\mathbb{R}^n$ and to other types of spaces (see [15]).
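As a toy illustration of the Monge problem stated above, the following sketch enumerates all transport maps between two uniform discrete measures on the real line and picks the cheapest one for the cost $|x - y|^2$. The point sets and the helper names `transport_cost` and `brute_force_monge` are illustrative assumptions, not part of the paper. In one dimension, the gradient of a convex function is a nondecreasing map, which is what the search returns here.

```python
from itertools import permutations

def transport_cost(xs, ys, perm, cost):
    """Average cost of the map sending xs[i] to ys[perm[i]] (uniform weights)."""
    return sum(cost(x, ys[j]) for x, j in zip(xs, perm)) / len(xs)

def brute_force_monge(xs, ys, cost):
    """Exhaustively search all bijections between two equal-size point sets."""
    best = min(permutations(range(len(ys))),
               key=lambda perm: transport_cost(xs, ys, perm, cost))
    return best, transport_cost(xs, ys, best, cost)

def quadratic(x, y):
    return (x - y) ** 2

xs = [0, 1, 3, 6]   # support of mu_0, each point carrying mass 1/4 (hypothetical data)
ys = [5, 2, 9, 4]   # support of mu_1, each point carrying mass 1/4 (hypothetical data)
perm, value = brute_force_monge(xs, ys, quadratic)
T = {x: ys[j] for x, j in zip(xs, perm)}   # the optimal map as a dictionary
```

The exhaustive search is only feasible for a handful of points; it is meant to make the definition concrete, not to be an efficient solver.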

The aim of this paper is to study existence, uniqueness, and regularity of optimal
transport maps for cost functions coming from LQ minimization problems in $\mathbb{R}^n$. Let us
consider a linear control system of the form

$$\dot{x} = Ax + Bu \tag{1.1}$$

where the state $x$ belongs to $\mathbb{R}^n$, the control $u$ belongs to $\mathbb{R}^m$, and $A$, $B$ are $n \times n$ and
$n \times m$ matrices respectively. For every initial state $x \in \mathbb{R}^n$ and every control
$u \in L^2([0,1]; \mathbb{R}^m)$, we denote by $x(\cdot\,; x, u) : [0,1] \to \mathbb{R}^n$ the unique solution to the
Cauchy problem

$$\begin{cases} \dot{x}(t) = A x(t) + B u(t) & \text{for a.e. } t \in [0,1], \\ x(0) = x. \end{cases} \tag{1.2}$$

Let us in addition consider a quadratic Lagrangian of the form

$$L(x,u) = \tfrac12 \langle x, Wx \rangle + \tfrac12 \langle u, Uu \rangle, \tag{1.3}$$

where $W$ is an $n \times n$ symmetric non-negative matrix and $U$ is an $m \times m$ symmetric positive
definite matrix. The cost $c : \mathbb{R}^n \times \mathbb{R}^n \to [0, +\infty]$ associated with (1.1) and (1.3) is given
by

$$c(x,y) := \inf \left\{ \int_0^1 L\bigl(x(t; x, u), u(t)\bigr) \, dt \;\middle|\; u \in L^2([0,1]; \mathbb{R}^m) \text{ s.t. } x(1; x, u) = y \right\}, \tag{1.4}$$

where we set $c(x,y) = +\infty$ if there is no $u \in L^2([0,1]; \mathbb{R}^m)$ such that $x(1; x, u) = y$.

We notice that LQ costs as above include as a particular case the Euclidean cost
$\tfrac12 |x - y|^2$, obtained by taking $B = I_n$, $U = I_n$, $A = W = 0$. Costs coming from optimal
control have already been studied in [1] (see also [6]). However, this reference does not
give regularity results or properties relating the transport map with the gradient of a
convex function. Furthermore, the study of a cost that is finite only for points lying in
the same leaf of a foliation, as in the non-controllable case, is also original.
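The claim that the Euclidean cost is a particular LQ cost can be checked by hand. The short computation below is a sketch of one way to do it, relying only on Jensen's inequality; it is added for the reader's convenience and is not taken from the original text.

```latex
% With A = W = 0 and B = U = I_n, system (1.1) reads \dot{x} = u and L(x,u) = |u|^2 / 2.
% Any control steering x to y in time 1 satisfies \int_0^1 u(t)\,dt = y - x, hence by Jensen
\[
  \int_0^1 \tfrac12 |u(t)|^2 \, dt \;\ge\; \tfrac12 \Bigl| \int_0^1 u(t) \, dt \Bigr|^2
  \;=\; \tfrac12 |y - x|^2 ,
\]
% with equality for the constant control u \equiv y - x, so that c(x,y) = \tfrac12 |x - y|^2 .
```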

The structure of the paper is the following. Our results are stated in Section 2. In
Section 3 we introduce some preliminaries in optimal transport theory concerned with
Kantorovich duality. In Sections 4 and 5 we provide the proofs of our results. Then,
we present some examples in Section 6. Finally, in Section 7 we conclude with several
remarks about our results.

2 Main results 

2.1 Preliminaries on linear systems 

Consider system (1.1) and let $V$ be the smallest linear subspace of $\mathbb{R}^n$ that contains the
image of the operator $B : \mathbb{R}^m \to \mathbb{R}^n$ and is invariant by $A$, or, more explicitly,

$$V = \mathrm{Span}\bigl\{ B, AB, A^2 B, \dots, A^{n-1} B \bigr\} \subset \mathbb{R}^n, \qquad d = \dim V. \tag{2.1}$$

The well-known Kalman criterion states that system (1.1) is controllable if and only if $d = n$;
this is stated below with a description of the situation where $d < n$. This decomposition
can be found for instance in [13, Lemma 3.3.3].

Proposition 2.1. Let $x, y$ be in $\mathbb{R}^n$. There exists a control $u \in L^2([0,1]; \mathbb{R}^m)$ such that
$x(1; x, u) = y$ if and only if $e^{A} x - y \in V$ (i.e. $y$ lies in the affine subspace $e^{A} x + V$).
In particular it exists for all $x, y$ if $d = n$; the system (or the pair $(A, B)$) is then called
controllable.

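If $d < n$, after a linear change of coordinates in $\mathbb{R}^n$, the control system has the
following form:

$$\dot{x} = \begin{pmatrix} \dot{x}_1 \\ \dot{x}_2 \end{pmatrix} = \begin{pmatrix} A_1 & A_3 \\ 0 & A_2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} B_1 \\ 0 \end{pmatrix} u, \tag{2.2}$$

where the state $x$ is partitioned into two blocks $x_1$ and $x_2$ of dimension $d$ and $n - d$
respectively, $A_1$ is $d \times d$, $A_3$ is $d \times (n-d)$, $A_2$ is $(n-d) \times (n-d)$ and $B_1$ is $d \times m$,
and the pair $(A_1, B_1)$ is controllable.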
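The dimension $d$ in (2.1) is the rank of the Kalman matrix $[B, AB, \dots, A^{n-1}B]$, so it can be computed mechanically. The sketch below does this with exact rational arithmetic from the standard library; the function names `matmul`, `rank` and `kalman_dimension` and the two example pairs are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction

def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def kalman_dimension(A, B):
    """d = dim Span{B, AB, ..., A^(n-1) B}, as in (2.1)."""
    n = len(A)
    blocks, block = [], B
    for _ in range(n):
        blocks.append(block)
        block = matmul(A, block)
    # Concatenate the n blocks side by side into an n x (n*m) matrix.
    K = [sum((blk[i] for blk in blocks), []) for i in range(n)]
    return rank(K)

A = [[0, 1], [0, 0]]                              # double integrator (hypothetical example)
d_controllable = kalman_dimension(A, [[0], [1]])  # control acts on the second state
d_degenerate = kalman_dimension(A, [[1], [0]])    # control acts on the first state only
```

In the first example $d = n = 2$ and the pair is controllable; in the second $AB = 0$, so $d = 1$ and the system falls in the non-controllable case described by (2.2).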



2.2 The controllable case 

The following result of existence, uniqueness and regularity follows easily from the classical 
theory of optimal mass transportation. Throughout the paper, M* denotes the transpose 
of the matrix M. 

Theorem 2.2. Assume that the linear control system (1.1) is controllable. Then there
are symmetric positive definite $n \times n$ matrices $D$ and $F$, and an invertible $n \times n$ matrix
$E$ such that

$$c(x,y) = \tfrac12 \langle x, Dx \rangle - \langle x, Ey \rangle + \tfrac12 \langle y, Fy \rangle \qquad \forall x, y \in \mathbb{R}^n. \tag{2.3}$$

Let $\mu_0, \mu_1$ be two compactly supported probability measures on $\mathbb{R}^n$. Assume that $\mu_0$ is
absolutely continuous with respect to the Lebesgue measure. Then, there is existence
and uniqueness of an optimal transport map $T : \mathbb{R}^n \to \mathbb{R}^n$. Such a map is characterized
by the existence of a convex function $\varphi : \mathbb{R}^n \to \mathbb{R}$ such that

$$T(x) = E^{-1} \nabla \varphi(x) \qquad \text{for a.e. } x \in \mathbb{R}^n. \tag{2.4}$$

If in addition, $\mu_0, \mu_1$ are associated with probability densities $f_0, f_1$ on $\mathrm{supp}(\mu_0)$ and
$\mathrm{supp}(\mu_1)$ respectively, with $f_0$ and $f_1$ bounded from below and above, and if $\mathrm{supp}(\mu_0)$
is connected and $\mathrm{supp}(\mu_1)$ convex, then $T$ is continuous.

We postpone several remarks concerning Theorem 2.2 to Section 7.



2.3 The non-controllable case 

Here $d < n$ in (2.1), hence the evolution of the $(n-d)$-dimensional block $x_2$ is totally fixed
by (2.2), and $c(x, y)$ will obviously be infinite for almost all pairs $x, y$: those such that
$y_2 \neq e^{A_2} x_2$. A result similar to the above theorem first requires that there exists at least one
transport map with finite cost, i.e.

$$\inf_{S_\sharp \mu_0 = \mu_1} \left\{ \mathrm{cost}_c(S) := \int_{\mathbb{R}^n} c(x, S(x)) \, d\mu_0(x) \right\} < +\infty. \tag{2.5}$$

Let $\pi_2 : \mathbb{R}^n \to \mathbb{R}^{n-d}$ be the projection on the second block in the coordinates of (2.2):
$\pi_2(x) = x_2$ (it is also the projection on the quotient $\mathbb{R}^n / V$).

Theorem 2.3. Let $\mu_0$ and $\mu_1$ be compactly supported probability measures with continuous
densities $f_0, f_1$ ($\mu_0 = f_0 \mathcal{L}^n$, $\mu_1 = f_1 \mathcal{L}^n$). There exists a transport map with finite cost (2.5)
if and only if

$$(\pi_2)_\sharp \mu_1 = (e^{A_2} \circ \pi_2)_\sharp \mu_0. \tag{2.6}$$

If this is satisfied, then there exists a unique optimal transport map $T : \mathbb{R}^n \to \mathbb{R}^n$.
Moreover, if both densities $f_0, f_1$ are bounded from below and above on $\mathrm{supp}(\mu_0)$ and
$\mathrm{supp}(\mu_1)$ respectively, and if $\mathrm{supp}(\mu_0)$ and $\mathrm{supp}(\mu_1)$ are both convex, then $T$ is continuous.

Again, remarks concerning Theorem 2.3 are postponed to Section 7.



3 Preliminaries in optimal transport theory 

Given two probability measures $\mu_0, \mu_1$ on $\mathbb{R}^n$ and a cost function $c : \mathbb{R}^n \times \mathbb{R}^n \to [0, +\infty]$,
we are looking for a transport map $T : \mathbb{R}^n \to \mathbb{R}^n$ which minimizes the transportation
cost $\int_{\mathbb{R}^n} c(x, T(x)) \, d\mu_0$. The constraint $T_\sharp \mu_0 = \mu_1$ being highly non-linear, the optimal
transport problem is quite difficult from the viewpoint of the calculus of variations. The major
advance on this problem was due to Kantorovich, who proposed in [8, 9] a notion of weak
solution of the optimal transport problem. He suggested looking for plans instead of
transport maps, that is probability measures $\gamma$ on $\mathbb{R}^n \times \mathbb{R}^n$ whose marginals are $\mu_0$ and
$\mu_1$, that is

$$(\pi_1)_\sharp \gamma = \mu_0 \quad \text{and} \quad (\pi_2)_\sharp \gamma = \mu_1,$$

where $\pi_i : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ are the canonical projections on the first and second
variable respectively. Denoting by $\Pi(\mu_0, \mu_1)$ the set of plans, the new minimization problem
becomes the following:

$$C(\mu_0, \mu_1) = \min_{\gamma \in \Pi(\mu_0, \mu_1)} \left\{ \int_{\mathbb{R}^n \times \mathbb{R}^n} c(x,y) \, d\gamma(x,y) \right\}. \tag{3.1}$$

If $\gamma$ is a minimizer for the Kantorovich formulation, we say that it is an optimal plan.
Due to the linearity of the constraint $\gamma \in \Pi(\mu_0, \mu_1)$, it is simple, using weak topologies, to
prove existence of solutions to (3.1) as soon as $c$ is lower semi-continuous (see for instance
[15]). The connection between the formulation of Kantorovich and that of Monge can be
seen by noticing that any transport map $T$ induces the plan defined by $(\mathrm{Id} \times T)_\sharp \mu_0$, which
is concentrated on the graph of $T$. Thus, the problem of showing existence of optimal
transport maps can be reduced to proving that an optimal transport plan is concentrated
on a graph. Moreover, if one can show that any optimal plan is concentrated on a graph,
since $\frac{\gamma_1 + \gamma_2}{2}$ is optimal if so are $\gamma_1$ and $\gamma_2$, uniqueness of the transport map easily follows.
The following definition is exactly [15, Definition 5.2]:
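Definition 3.1. Let $c : \mathbb{R}^n \times \mathbb{R}^n \to [0, \infty]$. A function $\psi : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is said to be
c-convex if it is not identically $+\infty$ and there exists a function $\zeta : \mathbb{R}^n \to \mathbb{R} \cup \{-\infty, +\infty\}$
such that

$$\psi(x) = \sup_{y \in \mathbb{R}^n} \bigl( \zeta(y) - c(x,y) \bigr) \qquad \forall x \in \mathbb{R}^n.$$

Then its c-transform is the function $\psi^c$ defined by

$$\psi^c(y) = \inf_{x \in \mathbb{R}^n} \bigl( \psi(x) + c(x,y) \bigr) \qquad \forall y \in \mathbb{R}^n,$$

and its c-subdifferential is the set defined by

$$\partial_c \psi := \bigl\{ (x,y) \in \mathbb{R}^n \times \mathbb{R}^n \;\big|\; \psi^c(y) - \psi(x) = c(x,y) \bigr\}.$$

Moreover, the c-subdifferential of $\psi$ at $x \in \mathbb{R}^n$ is

$$\partial_c \psi(x) := \bigl\{ y \in \mathbb{R}^n \;\big|\; (x,y) \in \partial_c \psi \bigr\},$$

or equivalently, $y \in \partial_c \psi(x)$ if and only if

$$\psi(x) + c(x,y) \le \psi(z) + c(z,y) \qquad \forall z \in \mathbb{R}^n.$$

The functions $\psi$ and $\psi^c$ are said to be c-conjugate.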
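To make Definition 3.1 concrete, here is a minimal sketch of its discrete analogue, in which the suprema and infima run over finite sets instead of $\mathbb{R}^n$. The sets `X`, `Y`, the function `zeta` and the helper name `c_transforms` are hypothetical choices made for illustration only; an integer-valued cost is used so that the equality defining the c-subdifferential can be tested exactly.

```python
def c_transforms(X, Y, cost, zeta):
    """Discrete analogue of Definition 3.1 on finite sets X and Y."""
    # psi is c-convex by construction: a supremum of zeta(y) - c(x, y).
    psi = {x: max(zeta[y] - cost(x, y) for y in Y) for x in X}
    # c-transform of psi.
    psi_c = {y: min(psi[x] + cost(x, y) for x in X) for y in Y}
    # c-subdifferential: pairs where psi_c(y) - psi(x) equals the cost.
    subdiff = {(x, y) for x in X for y in Y
               if psi_c[y] - psi[x] == cost(x, y)}
    return psi, psi_c, subdiff

def cost(x, y):
    return (x - y) ** 2          # integer-valued, so comparisons are exact

X = [0, 1, 2, 3]
Y = [0, 2, 4]
zeta = {0: 0, 2: 1, 4: 0}
psi, psi_c, subdiff = c_transforms(X, Y, cost, zeta)

# psi is recovered from its c-transform, as expected for a c-convex function.
recovered = {x: max(psi_c[y] - cost(x, y) for y in Y) for x in X}
```

On this example the point `x = 3` has two elements in its c-subdifferential, so the set `subdiff` is not a graph; this is the kind of degeneracy that absolute continuity of $\mu_0$ rules out almost everywhere in the continuous setting.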

Let us denote by $P_c(\mathbb{R}^n)$ the set of compactly supported probability measures on $\mathbb{R}^n$. From
now on, $\mathrm{supp}(\mu_0)$ and $\mathrm{supp}(\mu_1)$ will denote the supports of $\mu_0$ and $\mu_1$ respectively, i.e.
the smallest closed sets on which $\mu_0$ and $\mu_1$ are respectively concentrated. Kantorovich
duality can be stated as follows (see [15, Theorem 5.10]):

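Theorem 3.2. Let $\mu_0, \mu_1 \in P_c(\mathbb{R}^n)$ and $c : \mathbb{R}^n \times \mathbb{R}^n \to [0, \infty)$ be a lower semicontinuous
function. Then there exists a c-convex function $\psi : \mathbb{R}^n \to \mathbb{R}$ such that the following
holds: a transport plan $\gamma \in \Pi(\mu_0, \mu_1)$ is optimal if and only if $\gamma(\partial_c \psi) = 1$ (that is, $\gamma$ is
concentrated on the c-subdifferential of $\psi$). Moreover $\psi$ can be chosen such that

$$\psi(x) = \sup_{y \in \mathrm{supp}(\mu_1)} \bigl\{ \psi^c(y) - c(x,y) \bigr\} \qquad \forall x \in \mathbb{R}^n, \tag{3.2}$$

$$\psi^c(y) = \inf_{x \in \mathrm{supp}(\mu_0)} \bigl\{ \psi(x) + c(x,y) \bigr\} \qquad \forall y \in \mathbb{R}^n. \tag{3.3}$$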

By the above theorem we see that, in order to prove existence and uniqueness of
optimal transport maps, it suffices to prove that there exist two Borel sets $Z_0, Z_1 \subset \mathbb{R}^n$,
with $\mu_0(Z_0) = \mu_1(Z_1) = 1$, such that $\partial_c \psi$ is a graph inside $Z_0 \times Z_1$ (or equivalently
that $\partial_c \psi(x) \cap Z_1$ is a singleton for all $x \in Z_0$).

4 Proof of Theorem 2.2

First, we prove that (2.3) holds. This is not new, but the formulas are not given in
textbooks in extenso. The case $W = 0$ is more classical and one may find, for instance in
[11], the expression of $c$ in that case:

$$c(x,y) = \tfrac12 \bigl\langle x - e^{-A} y, \; G^{-1} (x - e^{-A} y) \bigr\rangle \quad \text{with} \quad G = \int_0^1 e^{-\tau A} B U^{-1} B^* e^{-\tau A^*} \, d\tau, \tag{4.1}$$

where the controllability Gramian $G$ is positive definite because the system is
controllable. This is indeed of the form (2.3). Let us derive the general form from the classical
linear Pontryagin Maximum principle (see for instance the same reference).
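A quick numerical sanity check of the Gramian formula in dimension one may help. The sketch below assumes scalar data $A = a$, $B = b$, $U = r$, $W = 0$ (hypothetical values), builds the candidate control $u(t) = r^{-1} b\, e^{-at} \lambda$ with $\lambda = G^{-1}(e^{-a} y - x)$, integrates the state and the running cost $\tfrac12 r u^2$ by a classical Runge-Kutta scheme, and compares the result with $\tfrac12 (x - e^{-a}y)^2 / G$. It only checks that this control steers $x$ to $y$ at that cost; minimality follows from convexity and is not tested here.

```python
import math

def gramian(a, b, r):
    """Scalar controllability Gramian G = int_0^1 exp(-2 a tau) b^2 / r dtau."""
    if a == 0.0:
        return b * b / r
    return b * b * (1.0 - math.exp(-2.0 * a)) / (2.0 * a * r)

def lq_cost(x, y, a, b, r):
    """Closed-form cost in dimension one when W = 0."""
    z = x - math.exp(-a) * y
    return 0.5 * z * z / gramian(a, b, r)

def simulate(x, y, a, b, r, steps=2000):
    """Integrate state and running cost under the candidate optimal control."""
    lam = (math.exp(-a) * y - x) / gramian(a, b, r)

    def u(t):
        return b * math.exp(-a * t) * lam / r

    def rhs(t, state):
        pos, _ = state
        return (a * pos + b * u(t), 0.5 * r * u(t) ** 2)

    h, t, state = 1.0 / steps, 0.0, (x, 0.0)
    for _ in range(steps):                      # classical RK4 step
        k1 = rhs(t, state)
        k2 = rhs(t + h / 2, tuple(s + h / 2 * k for s, k in zip(state, k1)))
        k3 = rhs(t + h / 2, tuple(s + h / 2 * k for s, k in zip(state, k2)))
        k4 = rhs(t + h, tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(s + h / 6 * (p + 2 * q + 2 * w + v)
                      for s, p, q, w, v in zip(state, k1, k2, k3, k4))
        t += h
    return state   # (x(1), accumulated cost)

final_state, running_cost = simulate(x=1.0, y=2.0, a=0.5, b=1.0, r=2.0)
predicted = lq_cost(1.0, 2.0, 0.5, 1.0, 2.0)
```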

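The pseudo-Hamiltonian $H_0 : \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ associated with the optimization
problem under study is given by

$$H_0(x,p,u) := \langle p, Ax + Bu \rangle - \tfrac12 \langle x, Wx \rangle - \tfrac12 \langle u, Uu \rangle, \tag{4.2}$$

the control $u$ that maximizes this expression for each $(p, x)$ is given by

$$u = U^{-1} B^* p \tag{4.3}$$

and the Hamiltonian $H : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$, defined as $H(x,p) = \max \{ H_0(x,p,u) \mid u \in \mathbb{R}^m \}$,
is given by

$$H(x,p) = \langle p, Ax \rangle - \tfrac12 \langle x, Wx \rangle + \tfrac12 \langle p, B U^{-1} B^* p \rangle. \tag{4.4}$$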

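Therefore the Hamiltonian differential equation $\dot{x} = \partial H / \partial p$, $\dot{p} = -\partial H / \partial x$ associated to
our minimization problem is given by

$$\begin{pmatrix} \dot{x} \\ \dot{p} \end{pmatrix} = \begin{pmatrix} A & B U^{-1} B^* \\ W & -A^* \end{pmatrix} \begin{pmatrix} x \\ p \end{pmatrix}. \tag{4.5}$$

Denote by $R(\cdot) : [0,1] \to M_{2n}(\mathbb{R})$ the fundamental solution to the Cauchy problem

$$\dot{R}(t) = \begin{pmatrix} A & B U^{-1} B^* \\ W & -A^* \end{pmatrix} R(t) \quad \forall t \in [0,1], \qquad R(0) = I_{2n}, \tag{4.6}$$

and write it as

$$R(t) = \begin{pmatrix} R_1(t) & R_2(t) \\ R_3(t) & R_4(t) \end{pmatrix},$$

where $R_i(\cdot)$ is an $n \times n$ matrix for $i = 1, \dots, 4$. For every $x \in \mathbb{R}^n$ fixed, denote by $\exp_x$ the
mapping from $\mathbb{R}^n$ to $\mathbb{R}^n$ which sends, via the Hamiltonian vector field, the initial adjoint
vector $p \in \mathbb{R}^n$ to the final state $x(1)$, that is

$$\exp_x(p) := R_1(1) x + R_2(1) p \qquad \forall p \in \mathbb{R}^n. \tag{4.7}$$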

Controllability of the system (1.1) implies that the affine mapping $\exp_x$ is onto, and hence
a bijection. Indeed, there is, for any $x, y \in \mathbb{R}^n$, (at least) one control $u \in L^2([0,1]; \mathbb{R}^m)$
which minimizes the cost

$$\int_0^1 L\bigl(x(t; x, u), u(t)\bigr) \, dt$$

among all controls steering $x$ to $y$ in time 1; thanks to the linear maximum principle,
there corresponds to each minimizing control a $p \in \mathbb{R}^n$ such that $\exp_x(p) = y$. Hence the
matrix $R_2(1)$ is invertible (the same argument on $[0,t]$ shows that $R_2(t)$ is invertible for
all $t \in (0,1]$, while $R_2(0) = 0$).

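Given $x$ and $y$ in $\mathbb{R}^n$, the cost between $x$ and $y$ is given by

$$c(x,y) = \int_0^1 \tfrac12 \langle x(t), W x(t) \rangle + \tfrac12 \langle p(t), B U^{-1} B^* p(t) \rangle \, dt,$$

where $(x(\cdot), p(\cdot)) : [0,1] \to \mathbb{R}^n \times \mathbb{R}^n$ is the solution to the Hamiltonian system (4.5)
satisfying

$$x(0) = x \quad \text{and} \quad \exp_x(p(0)) = y.$$

Set $p := p(0)$, that is

$$p = R_2(1)^{-1} \bigl( y - R_1(1) x \bigr). \tag{4.8}$$

We deduce easily that $c$ is a smooth function of the form

$$c(x,y) = \tfrac12 \langle x, Q_1 x \rangle + \langle x, C p \rangle + \tfrac12 \langle p, Q_2 p \rangle \qquad \forall x, y \in \mathbb{R}^n, \tag{4.9}$$

where both $Q_1, Q_2$ are non-negative symmetric $n \times n$ matrices given by

$$Q_1 := \int_0^1 \bigl[ R_1(t)^* W R_1(t) + R_3(t)^* B U^{-1} B^* R_3(t) \bigr] dt,$$

$$Q_2 := \int_0^1 \bigl[ R_2(t)^* W R_2(t) + R_4(t)^* B U^{-1} B^* R_4(t) \bigr] dt,$$

and where $C$ is the $n \times n$ matrix given by

$$C := \int_0^1 \bigl[ R_1(t)^* W R_2(t) + R_3(t)^* B U^{-1} B^* R_4(t) \bigr] dt.$$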

Let $x, y \in \mathbb{R}^n$ and $u \in L^2([0,1]; \mathbb{R}^m)$ a control minimizing $c(x,y)$ be fixed. For every
control $v \in L^2([0,1]; \mathbb{R}^m)$, denote by $y(\cdot\,; y, v) : [0,1] \to \mathbb{R}^n$ the unique solution to the
Cauchy problem

$$\begin{cases} \dot{y}(t) = A y(t) + B v(t) & \text{for a.e. } t \in [0,1], \\ y(1) = y. \end{cases}$$

By definition of the cost $c$, there holds

$$\int_0^1 L\bigl(y(t; y, v), v(t)\bigr) \, dt - c\bigl(y(0; y, v), y\bigr) \ge 0 \qquad \forall v \in L^2([0,1]; \mathbb{R}^m).$$

Moreover, there is equality in the above inequality whenever $v = u$. Thanks to the linear
maximum principle, since the cost $c$ is smooth, this means that

$$y = \exp_x(p) \quad \text{with} \quad p = -\nabla_x c(x, y).$$

Computing $\nabla_x c(x, y)$ by differentiating (4.9)-(4.8) with respect to the variable $x$ yields

$$Q_1 x + C p - R_1(1)^* \bigl(R_2(1)^{-1}\bigr)^* (C^* x + Q_2 p) = -p,$$

and finally (recall that $Q_1$ is symmetric):

$$C = -I_n + R_1(1)^* \bigl(R_2(1)^{-1}\bigr)^* Q_2, \qquad Q_1 = R_1(1)^* \bigl(R_2(1)^{-1}\bigr)^* C^* = C R_2(1)^{-1} R_1(1). \tag{4.10}$$

Plugging this and (4.8) into (4.9) yields the expression (2.3) for $c(x, y)$ with

$$D = R_2(1)^{-1} R_1(1), \qquad E = R_2(1)^{-1}, \qquad F = \bigl(R_2(1)^{-1}\bigr)^* Q_2 R_2(1)^{-1}, \tag{4.11}$$

where $D$ is symmetric because (4.10) implies $Q_1 = -D + D^* Q_2 D$ and $Q_1, Q_2$ are symmetric.
Positive definiteness of $D$ and $F$ may be deduced from the fact that there is no solution
of (1.1) driving $0$ to a nonzero $y$, or a nonzero $x$ to $0$, in finite time with zero cost.

Let us now show the existence and uniqueness of an optimal transport map. Let
$\mu_0, \mu_1$ be two compactly supported probability measures in $\mathbb{R}^n$ and assume that $\mu_0$ is
absolutely continuous with respect to the Lebesgue measure. Let $\psi$, $\phi := \psi^c : \mathbb{R}^n \to \mathbb{R}$
be the Kantorovich potentials given by Theorem 3.2 (note that $c$ is non-negative valued).
First, since $c$ is smooth and both sets $\mathrm{supp}(\mu_0)$, $\mathrm{supp}(\mu_1)$ are assumed to be compact,
(3.2)-(3.3) imply that both potentials $\psi, \phi$ are locally Lipschitz. Therefore, thanks to
Rademacher's Theorem and the fact that $\mu_0$ is absolutely continuous with respect to
the Lebesgue measure, $\psi$ is differentiable $\mu_0$-almost everywhere. Let $x \in \mathrm{supp}(\mu_0)$ be a
differentiability point of $\psi$. Let $y \in \mathrm{supp}(\mu_1)$ be such that

$$\phi(y) - \psi(x) = c(x,y),$$

and $u \in L^2([0,1]; \mathbb{R}^m)$ be a control minimizing

$$c(x,y) = \inf \left\{ \int_0^1 L\bigl(x(t; x, v), v(t)\bigr) \, dt \;\middle|\; v \in L^2([0,1]; \mathbb{R}^m) \text{ s.t. } x(1; x, v) = y \right\}.$$

Then we have

$$\int_0^1 L\bigl(y(t; y, v), v(t)\bigr) \, dt \ge c\bigl(y(0; y, v), y\bigr) \ge \phi(y) - \psi\bigl(y(0; y, v)\bigr) \qquad \forall v \in L^2([0,1]; \mathbb{R}^m),$$

with equality if $v = u$. As above, this implies that

$$y = \exp_x(p) \quad \text{with} \quad p = \nabla \psi(x).$$

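This shows that $y$ is uniquely determined for every differentiability point of $\psi$ and in turn
proves existence and uniqueness of an optimal measurable transport map $x \mapsto y$. Note
that the potential $\psi$ can be written as follows:

$$\psi(x) = \sup_{y \in \mathrm{supp}(\mu_1)} \bigl\{ \phi(y) - c(x,y) \bigr\}
= -\tfrac12 \langle x, Dx \rangle + \sup_{y \in \mathrm{supp}(\mu_1)} \bigl\{ \langle x, Ey \rangle - \tfrac12 \langle y, Fy \rangle + \phi(y) \bigr\}.$$

This shows that the function $\varphi : \mathbb{R}^n \to \mathbb{R}$ defined by

$$\varphi(x) := \psi(x) + \tfrac12 \langle x, Dx \rangle \qquad \forall x \in \mathbb{R}^n$$

is convex (as the sup of a family of convex functions) while, for almost every $x \in \mathrm{supp}(\mu_0)$,
deriving the expression of $\exp_x(p)$ from (4.7) and (4.11),

$$T(x) = \exp_x(\nabla \psi(x)) = E^{-1} \bigl( Dx + \nabla \psi(x) \bigr) = E^{-1} \nabla \varphi(x).$$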

It remains to show that under additional assumptions, $T$ is continuous. One obviously
has $\nabla \varphi(x) = E\, T(x)$ for almost every $x$; it is clear that this and $T_\sharp \mu_0 = \mu_1$ imply
$(\nabla \varphi)_\sharp \mu_0 = \tilde{\mu}_1$ for the measure $\tilde{\mu}_1$ defined by

$$\tilde{\mu}_1(A) = \mu_1 \bigl( \{ E^{-1} x \mid x \in A \} \bigr)$$

for every Borel set $A \subset \mathbb{R}^n$ ($\tilde{\mu}_1$ is the pushforward of $\mu_1$ by the map $x \mapsto Ex$). Since $\varphi$ is
convex, $\nabla \varphi$ is then the optimal transport map from $\mu_0$ to $\tilde{\mu}_1$ for the cost $c(x, y) = \tfrac12 |x - y|^2$.
This implies the continuity of $\nabla \varphi$, hence of $T$, according to [15, Theorem 12.50 (i)],
which asserts the following: if $\mu_0, \mu_1$ are two compactly supported probability measures in
$\mathbb{R}^n$ associated with densities $f_0, f_1$ which are bounded from below and above on $\mathrm{supp}(f_0)$
and $\mathrm{supp}(f_1)$ respectively, and if $\mathrm{supp}(f_0)$ is connected and $\mathrm{supp}(f_1)$ is convex, then the
optimal transport map from $\mu_0$ to $\mu_1$ with respect to the Euclidean quadratic
cost $c(x,y) = \tfrac12 |x - y|^2$ is continuous. The measures $\mu_0$ and $\tilde{\mu}_1$ satisfy these assumptions
in place of $\mu_0$ and $\mu_1$, because applying an invertible linear map changes neither the
convexity of the support of a density nor its bounds from below and above, and we just
proved that the corresponding optimal transport map is $\nabla \varphi$ with the same $\varphi$ as above.



5 Proof of Theorem 2.3



Let us first treat the case $d = 0$ separately. According to (2.1), either there is no control
($m = 0$) or the matrix $B$ is zero; in both cases the system reads $\dot{x} = Ax$ and

$$c(x,y) = \begin{cases} \tfrac12 \int_0^1 \langle e^{tA} x, W e^{tA} x \rangle \, dt & \text{if } y = e^{A} x, \\ +\infty & \text{otherwise.} \end{cases}$$

Then there is only one map $S$ such that $\mathrm{cost}_c(S) < \infty$, given by $S(x) = e^{A} x$ for all $x$, the
compatibility condition on measures is $\mu_1 = S_\sharp \mu_0$ with this precise $S$, and the result is proved in the
case $d = 0$.

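From now on, we assume that $d \in \{1, \dots, n-1\}$, i.e. none of the two blocks in (2.2) is
void ($d \ge 1$, $n - d \ge 1$). For every $x = (x_1, x_2) \in \mathbb{R}^d \times \mathbb{R}^{n-d} = \mathbb{R}^n$ and $u \in L^2([0,1]; \mathbb{R}^m)$,
the solution $x(\cdot\,; x, u) : [0,1] \to \mathbb{R}^n$ to the Cauchy problem (1.2) with the decomposition
given by (2.2) satisfies

$$x(t; x, u) = \begin{pmatrix} x_1(t; x, u) \\ x_2(t; x_2) \end{pmatrix} = \begin{pmatrix} e^{tA_1} x_1 + e^{tA_1} \int_0^t e^{-sA_1} \bigl[ A_3 e^{sA_2} x_2 + B_1 u(s) \bigr] ds \\ e^{tA_2} x_2 \end{pmatrix} \tag{5.1}$$

for every $t \in [0,1]$. Denote by $\bar{x}_1(\cdot) = \bar{x}_1(\cdot\,; x_1, u) : [0,1] \to \mathbb{R}^d$ the solution to the Cauchy
problem

$$\begin{cases} \dot{\bar{x}}_1(t) = A_1 \bar{x}_1(t) + B_1 u(t) & \text{for a.e. } t \in [0,1], \\ \bar{x}_1(0) = x_1. \end{cases}$$

Then we have

$$x_1(t; x, u) = \bar{x}_1(t; x_1, u) + G(t) x_2 \qquad \forall t \in [0,1],$$

where $G : [0,1] \to M_{d, n-d}(\mathbb{R})$ is defined by

$$G(t) := e^{tA_1} \int_0^t e^{-sA_1} A_3 e^{sA_2} \, ds \qquad \forall t \in [0,1].$$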

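Therefore we have for almost every $t \in [0,1]$,

$$L\bigl(x(t; x, u), u(t)\bigr) = \tfrac12 \langle x(t; x, u), W x(t; x, u) \rangle + \tfrac12 \langle u(t), U u(t) \rangle$$

$$= \tfrac12 \langle \bar{x}_1(t; x_1, u), W_1 \bar{x}_1(t; x_1, u) \rangle + \tfrac12 \langle u(t), U u(t) \rangle + \langle \bar{x}_1(t; x_1, u), X(t; x_2) \rangle + l(t; x_2),$$

where the symmetric matrix $W$ is decomposed, in the coordinates of (2.2), as

$$W = \begin{pmatrix} W_1 & W_3 \\ W_3^* & W_2 \end{pmatrix}$$

with $W_1$ and $W_2$ symmetric non-negative $d \times d$ and $(n-d) \times (n-d)$ matrices respectively, $W_3$ a
$d \times (n-d)$ matrix, and

$$X(t; x_2) = W_1 G(t) x_2 + W_3 x_2(t; x_2),$$

$$l(t; x_2) = \frac12 \left\langle \begin{pmatrix} G(t) x_2 \\ x_2(t; x_2) \end{pmatrix}, \begin{pmatrix} W_1 & W_3 \\ W_3^* & W_2 \end{pmatrix} \begin{pmatrix} G(t) x_2 \\ x_2(t; x_2) \end{pmatrix} \right\rangle.$$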



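In conclusion, we have for every $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R}^n$,

$$c(x,y) = \begin{cases} c_{x_2}\bigl(x_1, y_1 - G(1) x_2\bigr) + \int_0^1 l(t; x_2) \, dt & \text{if } y_2 = e^{A_2} x_2, \\ +\infty & \text{otherwise,} \end{cases} \tag{5.3}$$

where the cost $c_{x_2} : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ is defined by

$$c_{x_2}(x_1, z_1) := \inf \left\{ \int_0^1 L_{x_2}\bigl(t, \bar{x}_1(t; x_1, u), u(t)\bigr) \, dt \;\middle|\; u \in L^2([0,1]; \mathbb{R}^m) \text{ s.t. } \bar{x}_1(1; x_1, u) = z_1 \right\}, \tag{5.4}$$

for any $x_1, z_1 \in \mathbb{R}^d$, and where the Lagrangian $L_{x_2} : \mathbb{R} \times \mathbb{R}^d \times \mathbb{R}^m \to \mathbb{R}$ is defined
by

$$L_{x_2}(t, z, u) := \tfrac12 \langle z, W_1 z \rangle + \tfrac12 \langle u, U u \rangle + \langle z, X(t; x_2) \rangle. \tag{5.5}$$

Note that, because of the term that is linear in $z$, neither $L_{x_2}$ nor $c_{x_2}$ needs to be non-negative.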



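We proceed now as in the proof of Theorem 2.2. Let $x_2 \in \mathbb{R}^{n-d}$ be fixed; the pseudo-Hamiltonian
$H_0^{x_2} : \mathbb{R} \times \mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}^m \to \mathbb{R}$ associated with the cost $c_{x_2}$ is given by

$$H_0^{x_2}(t, z, p, u) := \langle p, A_1 z + B_1 u \rangle - \tfrac12 \langle z, W_1 z \rangle - \tfrac12 \langle u, U u \rangle - \langle z, X(t; x_2) \rangle. \tag{5.6}$$

Then $\partial H_0^{x_2} / \partial u = 0$ yields $u = U^{-1} B_1^* p$, and the Hamiltonian $H : \mathbb{R} \times \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ is given
by

$$H(t, z, p) = \max \bigl\{ H_0^{x_2}(t, z, p, u) \mid u \in \mathbb{R}^m \bigr\}
= \langle p, A_1 z \rangle - \tfrac12 \langle z, W_1 z \rangle + \tfrac12 \langle p, B_1 U^{-1} B_1^* p \rangle - \langle z, X(t; x_2) \rangle.$$

Therefore the Hamiltonian system associated to our minimization problem is given by

$$\begin{cases} \dot{z} = A_1 z + B_1 U^{-1} B_1^* p, \\ \dot{p} = -A_1^* p + W_1 z + X(t; x_2). \end{cases} \tag{5.7}$$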



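Denote by $R(\cdot) : [0,1] \to M_{2d}(\mathbb{R})$ the fundamental solution to the Cauchy problem

$$\dot{R}(t) = \begin{pmatrix} A_1 & B_1 U^{-1} B_1^* \\ W_1 & -A_1^* \end{pmatrix} R(t) \quad \forall t \in [0,1], \qquad R(0) = I_{2d}, \tag{5.8}$$

and write it as

$$R(t) = \begin{pmatrix} R_1(t) & R_2(t) \\ R_3(t) & R_4(t) \end{pmatrix},$$

where $R_i(\cdot)$ is a $d \times d$ matrix for $i = 1, \dots, 4$. Any solution $(z(\cdot), p(\cdot)) : [0,1] \to \mathbb{R}^d \times \mathbb{R}^d$
of (5.7) with $z(0) = z$, $p(0) = p$ can be written as

$$\begin{cases} z(t) = R_1(t) z + R_2(t) p + z_{x_2}(t), \\ p(t) = R_3(t) z + R_4(t) p + p_{x_2}(t), \end{cases} \qquad \forall t \in [0,1],$$

where $(z_{x_2}(\cdot), p_{x_2}(\cdot)) : [0,1] \to \mathbb{R}^d \times \mathbb{R}^d$ is the solution of (5.7) satisfying $z_{x_2}(0) = 0$, $p_{x_2}(0) = 0$. For
every $x_1 \in \mathbb{R}^d$, the mapping $\exp_{x_1} : \mathbb{R}^d \to \mathbb{R}^d$ defined by

$$\exp_{x_1}(p) := R_1(1) x_1 + R_2(1) p + z_{x_2}(1) \qquad \forall p \in \mathbb{R}^d$$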
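is an affine bijection. For every $x_1, z_1, p \in \mathbb{R}^d$ with $z_1 = \exp_{x_1}(p)$, there holds

$$c_{x_2}(x_1, z_1) = \tfrac12 \langle x_1, Q_1 x_1 \rangle + \tfrac12 \langle p, Q_2 p \rangle + \langle x_1, C p \rangle + \langle x_1, v_{x_2} \rangle + \langle p, w_{x_2} \rangle + k_{x_2}, \tag{5.9}$$

where both $Q_1, Q_2$ are non-negative symmetric $d \times d$ matrices given by

$$Q_1 := \int_0^1 \bigl( R_1(t)^* W_1 R_1(t) + R_3(t)^* B_1 U^{-1} B_1^* R_3(t) \bigr) dt,$$

$$Q_2 := \int_0^1 \bigl( R_2(t)^* W_1 R_2(t) + R_4(t)^* B_1 U^{-1} B_1^* R_4(t) \bigr) dt,$$

where $C$ is the $d \times d$ matrix given by

$$C := \int_0^1 \bigl( R_1(t)^* W_1 R_2(t) + R_3(t)^* B_1 U^{-1} B_1^* R_4(t) \bigr) dt,$$

and the vectors $v_{x_2}$, $w_{x_2}$ are given by

$$v_{x_2} := \int_0^1 \bigl( R_1(t)^* W_1 z_{x_2}(t) + R_3(t)^* B_1 U^{-1} B_1^* p_{x_2}(t) + R_1(t)^* X(t; x_2) \bigr) dt,$$

$$w_{x_2} := \int_0^1 \bigl( R_2(t)^* W_1 z_{x_2}(t) + R_4(t)^* B_1 U^{-1} B_1^* p_{x_2}(t) + R_2(t)^* X(t; x_2) \bigr) dt,$$

and

$$k_{x_2} := \int_0^1 \Bigl( \tfrac12 \langle z_{x_2}(t), W_1 z_{x_2}(t) \rangle + \tfrac12 \langle p_{x_2}(t), B_1 U^{-1} B_1^* p_{x_2}(t) \rangle + \langle z_{x_2}(t), X(t; x_2) \rangle \Bigr) dt.$$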

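We proceed now as in the proof of Theorem 2.2. Since $c_{x_2}$ is smooth, using the linear
maximum principle, taking the derivative in the $x_1$ variable in (5.9) and using that
$p = R_2(1)^{-1} \bigl[ z_1 - R_1(1) x_1 - z_{x_2}(1) \bigr]$ yields

$$Q_1 x_1 + C p + v_{x_2} - R_1(1)^* \bigl(R_2(1)^{-1}\bigr)^* \bigl( Q_2 p + C^* x_1 + w_{x_2} \bigr) = -p$$

and finally $Q_1 - R_1(1)^* (R_2(1)^{-1})^* C^* = 0$, $C - R_1(1)^* (R_2(1)^{-1})^* Q_2 + I_d = 0$ and
$v_{x_2} - R_1(1)^* (R_2(1)^{-1})^* w_{x_2} = 0$, whence (recall that $Q_2$ is symmetric):

$$C = -I_d + R_1(1)^* \bigl(R_2(1)^{-1}\bigr)^* Q_2, \qquad Q_1 = C R_2(1)^{-1} R_1(1), \qquad v_{x_2} = R_1(1)^* \bigl(R_2(1)^{-1}\bigr)^* w_{x_2}. \tag{5.10}$$

In conclusion, setting $D := R_2(1)^{-1} R_1(1)$, $E := R_2(1)^{-1}$ and $F := E^* Q_2 E$ yields

$$c_{x_2}(x_1, z_1) = \tfrac12 \langle x_1, D x_1 \rangle - \bigl\langle x_1, E \bigl( z_1 - z_{x_2}(1) \bigr) \bigr\rangle
+ \tfrac12 \bigl\langle z_1 - z_{x_2}(1), F \bigl( z_1 - z_{x_2}(1) \bigr) \bigr\rangle + \bigl\langle E \bigl( z_1 - z_{x_2}(1) \bigr), w_{x_2} \bigr\rangle + k_{x_2},$$

which in turn implies (by (5.3))

$$c\bigl( (x_1, x_2), (y_1, e^{A_2} x_2) \bigr) = \tfrac12 \langle x_1, D x_1 \rangle - \bigl\langle x_1, E \bigl( y_1 - G(1) x_2 - z_{x_2}(1) \bigr) \bigr\rangle$$
$$+ \tfrac12 \bigl\langle y_1 - G(1) x_2 - z_{x_2}(1), F \bigl( y_1 - G(1) x_2 - z_{x_2}(1) \bigr) \bigr\rangle$$
$$+ \bigl\langle E \bigl( y_1 - G(1) x_2 - z_{x_2}(1) \bigr), w_{x_2} \bigr\rangle + k_{x_2} + \int_0^1 l(t; x_2) \, dt. \tag{5.11}$$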

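$D$ and $F$ are symmetric positive definite for the same reason as in the proof of Theorem 2.2.
We are now ready to prove the result. For every $\bar{x}_2 \in \mathbb{R}^{n-d}$, set

$$M_{\bar{x}_2} := \bigl\{ x = (x_1, x_2) \in \mathbb{R}^d \times \mathbb{R}^{n-d} \mid x_2 = \bar{x}_2 \bigr\},$$
$$N_{\bar{x}_2} := \bigl\{ x = (x_1, x_2) \in \mathbb{R}^d \times \mathbb{R}^{n-d} \mid x_2 = e^{A_2} \bar{x}_2 \bigr\},$$

and let $\mu_0 = f_0 \mathcal{L}^n$, $\mu_1 = f_1 \mathcal{L}^n$ with densities $f_0, f_1 \in L^1(\mathbb{R}^n)$ be two compactly supported
probability measures.

Fact: A measurable map $S$ satisfies $S_\sharp \mu_0 = \mu_1$ and $\mathrm{cost}_c(S) < +\infty$ if and only if there
exists, for almost all $x_2$, a measurable $S_{x_2} : \mathbb{R}^d \to \mathbb{R}^d$ such that

$$S(x_1, x_2) = \bigl( S_{x_2}(x_1), e^{A_2} x_2 \bigr) \qquad \text{for a.e. } x_1 \tag{5.12}$$

and $S_{x_2}$ pushes forward the measure $\mu_0^{x_2}$ on $\mathbb{R}^d$ defined by

$$\mu_0^{x_2} := f_0(\cdot, x_2) \, |\det(e^{-A_2})| \, dx_1$$

to the measure $\mu_1^{x_2}$ on $\mathbb{R}^d$ defined by

$$\mu_1^{x_2} := f_1(\cdot, e^{A_2} x_2) \, dx_1.$$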

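Indeed, by Fubini's Theorem, the map $x_1 \mapsto S(x_1, x_2)$ is measurable for almost all $x_2$ and
$\mathrm{cost}_c(S) < +\infty$ implies

$$\int_{\mathbb{R}^d} c\bigl( (x_1, x_2), S(x_1, x_2) \bigr) f_0(x_1, x_2) \, dx_1 < +\infty \qquad \text{for a.e. } x_2,$$

and in turn, from (5.3), this implies, for almost every $x_2 \in \mathbb{R}^{n-d}$, that $S(x_1, x_2) \in N_{x_2}$
for almost all $x_1$, which means that $S(x_1, x_2)$ has the form (5.12). Now $S_\sharp \mu_0 = \mu_1$
means $\int h(x) \, d\mu_1(x) = \int h(S(x)) \, d\mu_0(x)$ for any $h \in L^1(\mathbb{R}^n)$. From Fubini's Theorem
together with the change of variable $y_2 = e^{A_2} x_2$ in the right-hand side, this yields, for any
positive $h \in L^1(\mathbb{R}^n)$,

$$\int_{\mathbb{R}^{n-d}} \left( \int_{\mathbb{R}^d} h(x_1, x_2) f_1(x_1, x_2) \, dx_1 \right) dx_2
= \int_{\mathbb{R}^{n-d}} \left( \int_{\mathbb{R}^d} h\bigl( S_{e^{-A_2} y_2}(x_1), y_2 \bigr) f_0(x_1, e^{-A_2} y_2) \, |\det(e^{-A_2})| \, dx_1 \right) dy_2.$$

This is equivalent to $(S_{x_2})_\sharp \mu_0^{x_2} = \mu_1^{x_2}$ for almost all $x_2$ and proves the fact.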

If $S$ satisfies $S_\sharp \mu_0 = \mu_1$ and $\mathrm{cost}_c(S) < +\infty$, the form (5.12) implies that, for all
measurable $h$ that depends on $x_2$ only, one has

$$\int h(x_2) \, d\mu_1(x_1, x_2) = \int h(e^{A_2} x_2) \, d\mu_0(x_1, x_2),$$

and this implies (2.6). Conversely, assume now that the measures satisfy (2.6): since

$$x_2 \mapsto \int_{\mathbb{R}^d} d\mu_0^{x_2} \quad \text{and} \quad x_2 \mapsto \int_{\mathbb{R}^d} d\mu_1^{x_2}$$

are the densities of $(e^{A_2} \circ \pi_2)_\sharp \mu_0$ and $(\pi_2)_\sharp \mu_1$ respectively, the measures $\mu_0^{x_2}$ and $\mu_1^{x_2}$
have, for almost all $x_2$, the same total mass, hence can be considered as probability
measures after normalisation. Then using (5.11) and arguing as in the proof of Theorem
2.2, we deduce that for almost every $x_2 \in \mathbb{R}^{n-d}$, there is a unique optimal transport map
$T_{x_2} : M_{x_2} \to N_{x_2}$ together with a convex function $\varphi_{x_2} : \mathbb{R}^d \to \mathbb{R}$ such that

$$T_{x_2}(x_1) = \bigl( E^{-1} \nabla \varphi_{x_2}(x_1), e^{A_2} x_2 \bigr) \qquad \text{for a.e. } x_1 \in \mathbb{R}^d.$$



This shows the existence and uniqueness of an optimal transport map $T$, defined by
$T(x_1, x_2) = \bigl( E^{-1} \nabla \varphi_{x_2}(x_1), e^{A_2} x_2 \bigr)$, from $\mu_0$ to $\mu_1$ (and a fortiori shows (2.5)). Note that
joint measurability of $T$ with respect to the two variables may be seen as a consequence of
the necessity part of [2, Theorem 3.2] which asserts that any optimal plan is concentrated
on a c-monotone Borel subset of $\mathbb{R}^n \times \mathbb{R}^n$.

If we assume in addition that $f_0, f_1$ are both continuous and bounded from below and
above on their support, then each $T_{x_2}$ is continuous. This is a consequence of Theorem
2.2 applied for each $x_2$ to the mass transportation problem from $\mu_0^{x_2}$ to $\mu_1^{x_2}$ with respect
to the cost given by (5.11). But by [15, Corollary 5.23], there is stability of transport
maps: if $\{x_2^k\}$ is a sequence converging to $x_2$, if the measures $\mu_0^{x_2^k}$ and $\mu_1^{x_2^k}$ weakly converge
to $\mu_0^{x_2}$ and $\mu_1^{x_2}$ respectively, and if in addition the corresponding costs converge uniformly
as well, then the transport maps $\{T_{x_2^k}\}$ converge to $T_{x_2}$ in probability. Since all $T_{x_2^k}$ are
indeed uniformly Hölder continuous (see [14, Theorem 50 (i)] or [15, Theorem 12.50], and
references therein), this concludes the proof of Theorem 2.3.

6 Examples



6.1 The case W = 0

If $W = 0$ then the second line in (4.5) is an independent first order differential equation
in the $p$ variable; hence $R_3(t) = 0$ for any $t \in [0,1]$, and one recovers the form (4.1) of the cost $c$. If in
addition we assume that $A = 0$ and that the system is controllable, then the matrix $B$
necessarily has rank $n$, so that $B U^{-1} B^*$ is invertible. In that case, we leave the reader to show
that the $2n \times 2n$ matrix $R(t)$ has the form

$$R(t) = \begin{pmatrix} I_n & t\, B U^{-1} B^* \\ 0_n & I_n \end{pmatrix}.$$

Then there holds

$$c(x,y) = \tfrac12 \bigl\langle (B U^{-1} B^*)^{-1} (y - x), (y - x) \bigr\rangle \qquad \forall x, y \in \mathbb{R}^n,$$

and any optimal transport map given by Theorem 2.2 has the form

$$T = (B U^{-1} B^*) \nabla \varphi,$$

where we used (4.11), and where $\varphi$ is a convex function. This can also be viewed as a consequence
of [1] because the above cost is half the squared Euclidean norm corresponding to the positive matrix
$(B U^{-1} B^*)^{-1}$, and $(B U^{-1} B^*) \nabla \varphi$ is nothing but the gradient of $\varphi$ for this Euclidean
metric.

6.2 The case A = 0, B = U = I_n

To have a nice form of $R(t)$ let us assume that $W$ is symmetric positive definite. In this
case, we leave the reader to show that the $2n \times 2n$ matrix $R(t)$ has the form

$$R(t) = \begin{pmatrix} \cosh(t W^{1/2}) & \sinh(t W^{1/2}) W^{-1/2} \\ W^{1/2} \sinh(t W^{1/2}) & \cosh(t W^{1/2}) \end{pmatrix},$$

where

$$\cosh(t W^{1/2}) = \sum_{k=0}^{\infty} \frac{t^{2k}}{(2k)!} W^{k}, \qquad \sinh(t W^{1/2}) = \sum_{k=0}^{\infty} \frac{t^{2k+1}}{(2k+1)!} W^{k + 1/2}.$$

Any optimal transport map given by Theorem 2.2 then has the form

$$T = \bigl( \sinh(W^{1/2}) W^{-1/2} \bigr) \nabla \varphi,$$

where we used (4.11), and where $\varphi$ is a convex function.
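The formulas of this example can be tested numerically in dimension one. The sketch below assumes $n = 1$, $A = 0$, $B = U = 1$ and $W = w > 0$ (hypothetical scalar data). In that case (4.11) gives $E = \sqrt{w} / \sinh\sqrt{w}$ and, after a short computation, $D = F = \sqrt{w} \coth\sqrt{w}$. The code compares the resulting closed form (2.3) with a Simpson quadrature of the running cost along the Hamiltonian trajectory joining $x$ to $y$; the function names are illustrative.

```python
import math

def closed_form_cost(x, y, w):
    """c(x, y) from (2.3) with n = 1, A = 0, B = U = 1, W = w > 0."""
    s = math.sqrt(w)
    D = F = s * math.cosh(s) / math.sinh(s)   # D = R2(1)^-1 R1(1), F = E Q2 E
    E = s / math.sinh(s)                      # E = R2(1)^-1
    return 0.5 * D * x * x - E * x * y + 0.5 * F * y * y

def integrated_cost(x, y, w, steps=2000):
    """Simpson quadrature of the running cost along the extremal from x to y."""
    s = math.sqrt(w)
    p0 = s * (y - math.cosh(s) * x) / math.sinh(s)   # initial adjoint, as in (4.8)

    def running(t):
        pos = math.cosh(s * t) * x + math.sinh(s * t) / s * p0
        mom = s * math.sinh(s * t) * x + math.cosh(s * t) * p0
        return 0.5 * w * pos * pos + 0.5 * mom * mom

    h = 1.0 / steps                                  # steps must be even
    total = running(0.0) + running(1.0)
    total += sum((4 if k % 2 else 2) * running(k * h) for k in range(1, steps))
    return total * h / 3.0

gap = abs(closed_form_cost(1.0, 2.0, 3.0) - integrated_cost(1.0, 2.0, 3.0))
```

As $w \to 0$ the closed form tends to $\tfrac12 (x - y)^2$, in agreement with the Euclidean special case.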



7 Final comments 

It is worth noticing that the existence and uniqueness part in Theorem 2.2 is not new.
It is a consequence of [H Theorem 4.1] together with the Lipschitz regularity of the cost.
The formula (2.3) implies that the Ma-Trudinger-Wang tensor associated with the cost is
identically zero. Such a result has been obtained previously by McCann and Lee without
computing explicitly the cost (see [10, Theorem 1.1]). We finally observe that (2.4) means
that the optimal transport map $T$ is the gradient of a convex function up to a linear change
of coordinates. Such a property is related to the vanishing of the Ma-Trudinger-Wang tensor
that we mentioned above and [SJ Theorem 4.3]. We refer the interested reader to [71 [TJ] for more details
about the Ma-Trudinger-Wang tensor and its link with regularity of optimal transport
maps.